Smart Paper Notes

Structure-Enhanced DRL for Optimal Transmission Scheduling

Jiazheng Chen , Wanchun Liu , Daniel E. Quevedo , Saeed R. Khosravirad , Yonghui Li , Branka Vucetic

Categories: Artificial Intelligence | Machine Learning

2022-12-24

Remote state estimation of large-scale distributed dynamic processes plays an important role in Industry 4.0 applications. In this paper, we focus on the transmission scheduling problem of a remote estimation system. First, we derive structural properties of the optimal sensor scheduling policy over fading channels. Then, building on these theoretical guidelines, we develop a structure-enhanced deep reinforcement learning (DRL) framework for optimal scheduling of the system to achieve the minimum overall estimation mean-square error (MSE). In particular, we propose a structure-enhanced action selection method, which tends to select actions that obey the policy structure. This explores the action space more effectively and enhances the learning efficiency of DRL agents. Furthermore, we introduce a structure-enhanced loss function that penalizes actions that do not follow the policy structure. The new loss function guides the DRL agent to converge to the optimal policy structure quickly. Our numerical experiments illustrate that the proposed structure-enhanced DRL algorithms can cut training time by 50% and reduce the remote estimation MSE by 10% to 25% compared to benchmark DRL algorithms. In addition, we show that the derived structural properties exist in a wide range of dynamic scheduling problems that go beyond remote state estimation.

Translated by Google Translate

NumHTML: Numeric-Oriented Hierarchical Transformer Model for Multi-task Financial Forecasting

Linyi Yang , Jiazheng Li , Ruihai Dong , Yue Zhang , Barry Smyth

Categories: Machine Learning | Artificial Intelligence

2022-01-05

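Financial forecasting has been an important and active area of machine learning research because of the challenges it presents and the potential rewards that even small improvements in prediction accuracy can bring. Traditionally, financial forecasting has relied heavily on quantitative indicators and metrics derived from structured financial statements. Earnings conference call data, including text and audio, is an important source of unstructured data that has been used for various prediction tasks with deep learning and related approaches. However, current deep learning-based methods are limited in the way they handle numeric data; numbers are typically treated as plain-text tokens, without taking advantage of their underlying numeric structure. This paper describes a numeric-oriented hierarchical transformer model that predicts stock returns and financial risk from multi-modal aligned earnings call data by exploiting the different categories of numbers (monetary, temporal, percentages, etc.) and their magnitudes. We present the results of a comprehensive evaluation of NumHTML against several state-of-the-art baselines on a real-world, publicly available dataset. The results indicate that NumHTML significantly outperforms the current state of the art across a variety of evaluation metrics and that it has the potential to offer significant financial gains in a practical trading context.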
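
To make the idea of "categories of numbers and their magnitudes" concrete, here is a minimal sketch of how a numeric token might be annotated before it reaches a model. It is an illustration only, not NumHTML's actual preprocessing: the function name `describe_number`, the currency symbols checked, and the use of floor(log10) as the magnitude are all assumptions, and the temporal category mentioned in the abstract is left out for brevity.

```python
import math
import re


def describe_number(token: str):
    """Return a (category, order_of_magnitude) pair for one numeric token.

    Hypothetical helper: the categories and the magnitude definition are
    illustrative assumptions, not taken from the paper.
    """
    text = token.strip()
    if text.startswith(("$", "€", "£")):
        category = "monetary"
    elif text.endswith("%"):
        category = "percentage"
    else:
        category = "other"

    # Keep only digits, the decimal point, and a minus sign before parsing.
    digits = re.sub(r"[^0-9.\-]", "", text)
    value = float(digits)  # raises ValueError if the token holds no digits

    # Order of magnitude: floor(log10(|value|)); zero is mapped to 0 by convention.
    magnitude = 0 if value == 0 else math.floor(math.log10(abs(value)))
    return category, magnitude


for tok in ["$1,250.50", "12%", "0.05"]:
    print(tok, describe_number(tok))
```

A plain-text tokenizer would see "$1,250.50" and "$1.25" as similar strings; the pair returned here separates them by three orders of magnitude, which is the kind of structure the abstract says is otherwise lost.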

Robust Point Cloud Registration Framework Based on Deep Graph Matching (TPAMI Version)

Kexue Fu , Jiazheng Luo , Xiaoyuan Luo , Shaolei Liu , Chenxi Zhang , Manning Wang

Category: Computer Vision

2022-11-09

3D point cloud registration is a fundamental problem in computer vision and robotics. Recently, learning-based point cloud registration methods have made great progress. However, these methods are sensitive to outliers, which lead to more incorrect correspondences. In this paper, we propose a novel deep graph matching-based framework for point cloud registration. Specifically, we first transform point clouds into graphs and extract deep features for each point. Then, we develop a module based on deep graph matching to calculate a soft correspondence matrix. By using graph matching, not only the local geometry of each point but also its structure and topology over a larger range are considered in establishing correspondences, so that more correct correspondences are found. We train the network with a loss directly defined on the correspondences, and in the test stage the soft correspondences are transformed into hard one-to-one correspondences so that registration can be performed by a correspondence-based solver. Furthermore, we introduce a transformer-based method to generate edges for graph construction, which further improves the quality of the correspondences. Extensive experiments on object-level and scene-level benchmark datasets show that the proposed method achieves state-of-the-art performance. The code is available at: https://github.com/fukexue/RGM.
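
The step from soft to hard one-to-one correspondences can be illustrated with a small sketch. The abstract does not say which assignment procedure is used, so the greedy rule below is a stand-in chosen for brevity, and `greedy_hard_assignment` is a hypothetical name; an optimal assignment solver would be a natural alternative.

```python
def greedy_hard_assignment(soft):
    """Turn a soft correspondence matrix into one-to-one matches greedily.

    soft[i][j] is the score that source point i matches target point j.
    Greedy selection is an assumption for illustration, not the paper's method.
    """
    n_rows, n_cols = len(soft), len(soft[0])

    # List every (score, row, col) candidate, highest score first.
    candidates = sorted(
        ((soft[i][j], i, j) for i in range(n_rows) for j in range(n_cols)),
        key=lambda item: -item[0],
    )

    used_rows, used_cols, matches = set(), set(), {}
    for _score, i, j in candidates:
        # Accept a pair only if neither point is already matched.
        if i not in used_rows and j not in used_cols:
            matches[i] = j
            used_rows.add(i)
            used_cols.add(j)
    return matches


soft = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
]
print(greedy_hard_assignment(soft))
```

Note that a row-wise argmax would send both source points 0 and 1 to target point 0; enforcing the one-to-one constraint pushes point 1 to its second choice instead.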

Magnetic Resonance Spectroscopy Deep Learning Denoising Using Few In Vivo Data

Dicheng Chen , Wanqi Hu , Huiting Liu , Yirong Zhou , Tianyu Qiu , Yihui Huang , Zi Wang , Jiazheng Wang , Liangjie Lin , Zhigang Wu

Category: Machine Learning

2021-01-26

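Magnetic resonance spectroscopy (MRS) is a noninvasive tool for revealing metabolic information. One challenge of 1H-MRS is its low signal-to-noise ratio (SNR). A typical way to improve the SNR is to perform signal averaging (SA) over M repeated samples. However, the data acquisition time increases M-fold accordingly, and a complete clinical MRS scan takes approximately 10 minutes at the common setting M = 128. Recently, deep learning has been introduced to improve the SNR, but most approaches use simulated data as the training set. This may hinder MRS applications, since differences such as acquisition system imperfections and physiological and psychological conditions may exist between simulated and in vivo data. Here, we propose a new scheme that uses purely the repeated samples of realistic data. A deep learning model, Refusion Long Short-Term Memory (ReLSTM), is designed to learn the mapping from low-SNR time-domain data (24 SA) to high-SNR data (128 SA). Experiments on the in vivo brain spectra of 7 healthy subjects, 2 brain tumor patients, and 1 cerebral infarction patient show that, using only 20% of the repeated samples, the spectra denoised by ReLSTM provide estimated metabolite concentrations comparable to those from 128 SA. Compared with the state-of-the-art low-rank denoising method, ReLSTM achieves lower relative errors and Cramér-Rao lower bounds when quantifying some important biomarkers. In summary, ReLSTM can perform high-fidelity denoising of spectra under fast acquisition (24 SA), which would be valuable for MRS clinical studies.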
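
The trade-off behind signal averaging can be worked through numerically. The sketch below assumes each repeat carries the same signal plus uncorrelated, zero-mean noise, in which case averaging M repeats improves the SNR by a factor of sqrt(M); that assumption and the function names are illustrative and do not come from the abstract.

```python
import math


def snr_gain(m_from: int, m_to: int) -> float:
    """Expected SNR ratio when averaging m_to repeats instead of m_from.

    Assumes identical signal in every repeat and uncorrelated zero-mean noise,
    so that SNR scales with the square root of the number of averages.
    """
    if m_from <= 0 or m_to <= 0:
        raise ValueError("numbers of averages must be positive")
    return math.sqrt(m_to / m_from)


def acquisition_fraction(m_fast: int, m_full: int) -> float:
    """Fraction of the full scan time needed when only m_fast repeats are acquired."""
    return m_fast / m_full


gain = snr_gain(24, 128)
fraction = acquisition_fraction(24, 128)
print(f"SNR gap between 24 SA and 128 SA: {gain:.2f}x")
print(f"Share of acquisition time used:   {fraction:.1%}")
```

Under these assumptions, a denoiser that maps 24 SA data to 128 SA quality has to close an SNR gap of a little over a factor of two while using under a fifth of the acquisition time, which is consistent with the "20% of the repeated samples" figure quoted above.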

Data-Augmented Contact Model for Rigid Body Simulation

Yifeng Jiang , Jiazheng Sun , C. Karen Liu

Categories: Robotics | Machine Learning

2018-03-11

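Accurately modeling contact behaviors of real-world, near-rigid materials remains a grand challenge for existing rigid-body physics simulators. This paper introduces a data-augmented contact model that combines analytical solutions with observed data to predict the 3D contact impulse, which can cause rigid bodies to bounce, slide, or spin in all directions. Our method enhances the expressiveness of the standard Coulomb contact model by learning contact behaviors from observed data, while preserving the fundamental contact constraints whenever possible. For example, a classifier is trained to approximate the transition between static and dynamic friction, while the non-penetration constraint during collision is enforced analytically. Our method computes the aggregated effect of contact for the entire rigid body, instead of predicting the contact force for each contact point individually, so the simulation speed stays the same as the number of contact points grows with more detailed geometry.

Supplemental video: https://shorturl.at/eilwx

Keywords: physics simulation algorithms, dynamics learning, contact learning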
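
For reference, the analytic rule that the learned classifier is said to approximate can be written in a few lines. This is the textbook Coulomb friction-cone test, not the paper's model; the function name and the two-component tangential force are assumptions made for the example.

```python
import math


def coulomb_contact_state(tangential_force, normal_force, mu_static):
    """Classify a contact as 'static' or 'dynamic' with the Coulomb cone test.

    tangential_force: (fx, fy) components in the contact plane.
    normal_force: magnitude of the normal force (must be non-negative).
    mu_static: static friction coefficient.
    """
    if normal_force < 0:
        raise ValueError("normal force must be non-negative (contacts cannot pull)")

    tangential_magnitude = math.hypot(*tangential_force)

    # Inside or on the friction cone: friction can hold the body in place.
    if tangential_magnitude <= mu_static * normal_force:
        return "static"
    # Outside the cone: the contact slides.
    return "dynamic"


print(coulomb_contact_state((3.0, 4.0), 10.0, 0.6))
print(coulomb_contact_state((3.0, 4.0), 10.0, 0.4))
```

The hard threshold at mu_static * normal_force is exactly the kind of idealized transition that a data-driven classifier could soften to match observed behavior.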

Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

Category: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.

Backdoor Attacks Against Dataset Distillation

Yugeng Liu , Zheng Li , Michael Backes , Yun Shen , Yang Zhang

Category: Machine Learning

2023-01-03

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
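
The attack success rate (ASR) quoted above is commonly computed as the fraction of trigger-carrying inputs that the model assigns to the attacker's target label. The sketch below follows that common definition as an assumption; the abstract itself does not spell out the formula, and `attack_success_rate` is a hypothetical helper.

```python
def attack_success_rate(predictions, target_label):
    """Fraction of triggered inputs classified as the attacker's target label.

    predictions: predicted labels for inputs that carry the backdoor trigger.
    target_label: the label the attacker wants those inputs to receive.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    hits = sum(1 for label in predictions if label == target_label)
    return hits / len(predictions)


# Four triggered test inputs, three of which are pushed to target label 3.
print(attack_success_rate([3, 3, 1, 3], target_label=3))
```

An ASR close to 1.0, as reported for DOORPING, therefore means nearly every triggered input is misclassified as the target label.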

Language Models are Drummers: Drum Composition with Natural Language Pre-Training

Li Zhang , Chris Callison-Burch

Category: Natural Language Processing

2023-01-03

Automatic music generation with artificial intelligence typically requires a large amount of data, which is hard to obtain for many less common genres and musical instruments. To tackle this issue, we present ongoing work and preliminary findings on the possibility for deep models to transfer knowledge from language to music, by finetuning large language models pre-trained on a massive text corpus on only hundreds of MIDI files of drum performances. We show that by doing so, one of the largest state-of-the-art models (GPT3) is capable of generating reasonable drum grooves, while models that are not pre-trained (Transformer) show no such ability beyond naive repetition. Evaluating generated music is a challenging task, and evaluating drum grooves is even more so, given the little precedent in the literature. Hence, we propose a tailored structural evaluation method and analyze drum grooves produced by GPT3 compared to those played by human professionals, exposing the strengths and weaknesses of such generation by language-to-music transfer. Our findings suggest that language-to-music transfer learning with large language models is viable and promising.

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han , Jiangning Zhang , Zhucun Xue , Chao Xu , Xintian Shen , Yabiao Wang , Chengjie Wang , Yong Liu , Xiangtai Li

Category: Computer Vision

2023-01-03

Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset for FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be available.

RELIANT: Fair Knowledge Distillation for Graph Neural Networks

Yushun Dong , Binchi Zhang , Yiling Yuan , Na Zou , Qi Wang , Jundong Li

Category: Machine Learning

2023-01-03

Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. Therefore, it is difficult to deploy them onto edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution to compress GNNs, where a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias from the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
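
The idea of a student "mimicking" a teacher is often implemented as a divergence between temperature-softened output distributions. The sketch below shows that standard soft-label distillation loss for a single node's logits; it is generic knowledge distillation, not RELIANT's fairness-aware objective, and the temperature value and function names are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class probabilities.

    The temperature**2 factor is the usual rescaling that keeps gradient
    magnitudes comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0  # terms with zero teacher probability contribute nothing
    )
    return temperature ** 2 * kl


teacher = [2.0, 1.0, 0.1]
print(distillation_loss([2.0, 1.0, 0.1], teacher))  # student matches teacher
print(distillation_loss([0.1, 1.0, 2.0], teacher))  # student disagrees
```

Because this loss rewards matching the teacher's outputs wholesale, any bias in those outputs is passed on to the student, which is the failure mode the abstract sets out to address.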
